ABSTRACT


Building on prior work visualizing player behavior using 
interaction networks [1], we examined whether measures of 
implicit science learning collected during gameplay were 
significantly related to changes in external pre-post assessments of 
the same constructs. As part of a national implementation study, 
we collected data from 329 high school students playing an optics 
puzzle game, Quantum Spectre, and modeled their gameplay as an 
interaction network, examining errors hypothesized to be related 
to a lack of implicit understanding of the science concepts 
embedded in the game. Hierarchical linear modeling (HLM) 
showed a negative relationship between the science errors 
identified during gameplay and implicit science learning. These 
results suggest Quantum Spectre gameplay behaviors are valid 
assessments of implicit science learning. Implications for how
gameplay data might inform classroom teaching and in-game
scaffolding are discussed.


1. INTRODUCTION


As digital games become increasingly prevalent in today’s society 
and are played by the majority of youth of all demographics [2], it 
behooves us to study how the energy and passion invested in 
gaming can be harnessed for productive purposes. Game-based 
learning interests education researchers and learning scientists 
because digital games uniquely engage learners and because their 
data logs can serve as input for innovative learning assessments 
[3]. Data logs generated through gameplay can be used to study 
players’ in-game activity [4] and how game-based learning can be 
leveraged for classroom learning. Research shows that elements 
of gameplay can invoke complex thinking such as scientific 
inquiry [5] and may foster learning-related skills such as creativity 
and persistence [4].


This work examines complex behaviors of students solving optics
puzzles in the educational game Quantum Spectre, using 
interaction networks. An Interaction Network is a complex 
network representation of all observed player-game interactions 
for a given problem or task in a game or tutoring system [6]. 
Regions of the network can be discovered by applying network 
clustering methods. These regions correspond to high-level 
student approaches to problems [7]. In this work, we used 
Interaction Networks as visualizations to analyze Quantum 
Spectre gameplay data and automated the coding of game states 
that correspond to incorrect applications of the game's core 
science concepts. Three types of errors were coded: two science 
errors (placement and rotation) and puzzle errors. 


This paper reports HLM analyses that relate those coded game 
states to implicit science learning measured by external pre/post 
assessments. The analyses examine how game-based learning is a 
function of what players do in the game, not simply duration of 
gameplay or highest level reached. This information is useful for 
building an adaptive version of the game to scaffold players’ 
implicit science learning and for informing teachers about 
important aspects of student competency. 


2. IMPLICIT SCIENCE LEARNING 


Polanyi argued that implicit knowledge (also called tacit 
knowledge) is foundational and a required element of explicit 
learning [8]. Implicit understandings are embodied and enacted 
through our interactions with the world around us, but may not yet 
be formalized or expressed verbally or textually. Vygotsky 
described similar abilities and understandings a learner brings to a 
learning situation that can be scaffolded by a teacher, 
environment, and tools [9]. Implicit misunderstandings (often 
called misconceptions) may get in the way of a learner’s 
conceptual development [10, 11], particularly in the area of basic 
physics, such as Newton’s Laws of Motion. diSessa distinguishes
the intuitive knowledge that novices hold (a book will not fall
through a table, or a glowing filament is hot) from an expert
understanding of these phenomena, explaining that while learners’
behaviors may be guided by implicit understandings, the learner is
not necessarily ready to express the related formalisms or question
the ideas in a deeper sense [12].


Games promise to reveal implicit learning because (a) they can be
“sticky,” meaning they encourage players to dwell in the
phenomena, and (b) they leave a digital trail that reveals the
patterns the players used in their learning process. Several
researchers have used educational data mining techniques within
an Evidence-Centered Design framework to develop stealth 
assessments that discern evidence of learning from the vast 
amount of click data generated by online science games such as 
SimCityEDU [13], Physics Playground [14], and Surge [15]. 


As players “level up” in a game, they typically deal with the 
mechanics in increasingly complex applications, building implicit 
knowledge about the underlying system. Because games allow
players to fail, repeat, revise, and try again, recording what
players do in the process, games may be powerful formative
assessments of learning and of the strategies players build. The
methods players use to tackle new challenges may demonstrate 
conceptual understanding that the learner may not express in other 
ways and that may not be measured by current external learning 
assessments [4, 16]. Careful alignment of game mechanics with 
learning and assessment mechanics [17] may reveal implicit 
learning and empower teachers and learners to help bridge game- 
based knowledge to other forms of learning. 


In a classroom, teachers may be able to build on implicit game- 
based learning if they have the right information and tools to 
support students at key moments in the learning process. This
may consist of real-time information provided during class that
shows who is struggling and needs attention, or more reflective
information provided after school to help plan lessons for the next
day based on class gameplay [18]. Post-game debriefing and
discussions connecting gameplay with classroom learning help 
students apply and transfer learning that takes place in games 
[19]. To exploit learning that happens in games, teachers need to 
build bridges between the students’ “aha” moments while playing 
[20] and the content being covered in the classroom. 


3. QUANTUM SPECTRE 

To examine implicit science game-based learning, we studied 
high school students playing a physics-oriented game called
Quantum Spectre. Quantum Spectre is a puzzle-style game, 
designed for play in browsers and on tablets (Figure 1).


Figure 1: Quantum Spectre Puzzle 21. Players must direct the
laser beams to the matching colored targets using movable 
mirrors and other optical devices, selected from the inventory 
on the right. 


Players use optical devices, such as lenses and mirrors, to guide 
colored laser beams to matching targets. The lenses and mirrors 
can be flat, convex, or concave and single or double-sided. All 
devices produce scientifically accurate results when interacting 
with the laser beams. When the laser beams in a puzzle reach the 
matching colored targets, the puzzle is solved (i.e., goal state is
reached) and the player is scored on the number of moves used.
The player earns three stars if the puzzle has been solved in the 
optimal number of moves, two stars for a low number of extra 
moves, and one star for simply solving the puzzle. Regardless of 
their score, players can proceed onto the next level, but players 
can repeat earlier levels at any time to improve their performance. 


The game is divided into 6 zones with 30 puzzles in each zone. In 
Zone 1 of Quantum Spectre, the puzzles focus on 2 key concepts:


• The Law of Reflection, or Angle of Incidence equals Angle
of Reflection: When reflecting off of a smooth surface, the
path of a ray of light (such as a laser beam) will make the 
same angle with the surface (relative to the normal) upon exit 
as it makes upon entry. 


• Slope: Players can use the squares on the game grid and
calculate the slope (rise over run) to figure out and/or predict 
the paths of laser beams and where to place items. 
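The two concepts above can be made concrete with a short sketch. It assumes a beam direction and a mirror normal given as 2D vectors and grid squares given as (x, y) pairs; the function names are hypothetical and the game's own coordinate and angle conventions are not described in this paper.

```python
from fractions import Fraction


def slope(p, q):
    """Rise over run between two grid squares p=(x1, y1) and q=(x2, y2)."""
    (x1, y1), (x2, y2) = p, q
    if x2 == x1:
        raise ValueError("vertical line: slope is undefined")
    return Fraction(y2 - y1, x2 - x1)


def reflect(direction, normal):
    """Reflect a beam direction off a flat mirror with the given unit normal.

    Uses r = d - 2 (d . n) n, which makes the angle of reflection equal
    to the angle of incidence.
    """
    dx, dy = direction
    nx, ny = normal
    dot = dx * nx + dy * ny
    return (dx - 2 * dot * nx, dy - 2 * dot * ny)


# A beam heading down and to the right hits a horizontal mirror (normal
# pointing up) and leaves heading up and to the right.
print(reflect((1, -1), (0, 1)))   # (1, 1)
print(slope((0, 0), (4, 2)))      # 1/2
```

A player who extends the beam's slope across the grid can predict which squares the beam crosses, and therefore where a mirror would intercept it.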


This study focuses on data from Puzzles 14-23 in Zone 1 of the 
game. At this point in gameplay, players have presumably 
mastered the game mechanic, and mastery of the puzzles typically 
requires an understanding of Slope and the Law of Reflection. 
Table 1 provides an overview of Puzzles 14-23. The number of 
goal states reflects the number of unique solutions (position- 
rotation combinations) for each puzzle. 


Table 1: Quantum Spectre Puzzles 14-23 


Game Level | # Mirrors | # Targets | # Optimal Moves | # Goal States
14         | 1         | 1         | 2               | 1
15         | 2         | 1         | 4               | 5
16         | 2         | 1         | 3               | 8
17         | 2         | 2         | 4               | 1
18         | 2         | 2         | 4               | 6
19         | 4         | 4         | 7               | 4
20         | 6         | 3         | 12              | 42
21         | 6         | 5         | 11              | 6
22         | 3         | 1         | 6               | 1
23         | 4         | 2         | 8               | 3


4. CLASSIFYING GAMEPLAY BEHAVIORS USING INTERACTION NETWORKS


To simplify the vast number of puzzle solution paths into a 
manageable group we could study, we used a method called 
Interaction Networks (INs). INs use a complex network data 
structure to represent players’ solutions as traces of game states 
and actions, with additional information such as edge labels (e.g., 
labels of player actions). This process involved 4 key steps [1]: 
creating a full IN for each puzzle, clustering player actions using 
laser shapes, classifying clusters for evidence of implicit science 
understanding, and automating coding of player actions. 


4.1 Create Full Interaction Network 

To construct an IN for a given puzzle, we collected the set of all
solution attempts for that puzzle. Each interaction is defined as a
triple of Initial State, Action, and Resulting State, and each attempt
runs from the start of the puzzle until the player
solves the puzzle or exits the system. A sample trace is shown in
Figure 2. Player actions are represented as edges in the network.
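As a minimal sketch of this construction, the code below merges traces of (Initial State, Action, Resulting State) triples into one network with visit counts. The state and action labels are hypothetical placeholders, and the sketch assumes each interaction's resulting state is the next interaction's initial state; it is not the authors' implementation.

```python
from collections import defaultdict


def build_interaction_network(traces):
    """Merge solution attempts into one interaction network.

    Each trace is a list of (state, action, next_state) triples.
    Returns (states, edges): states maps each state to its visit count,
    edges maps each triple to the number of times it was observed.
    """
    states = defaultdict(int)
    edges = defaultdict(int)
    for trace in traces:
        for i, (state, action, next_state) in enumerate(trace):
            if i == 0:
                states[state] += 1      # count the starting state once
            states[next_state] += 1
            edges[(state, action, next_state)] += 1
    return dict(states), dict(edges)


# Two hypothetical attempts at the same puzzle.
traces = [
    [("start", "place A", "s1"), ("s1", "rotate A", "goal")],
    [("start", "place A", "s1"), ("s1", "rotate A", "s2"),
     ("s2", "rotate A", "goal")],
]
states, edges = build_interaction_network(traces)
print(len(states), len(edges))   # 4 4
```

Counting unique keys in `states` and `edges` gives the kind of per-puzzle totals a network summary would report.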


Figure 2: Sample trace of player actions in Quantum Spectre 
Puzzle 18 of Zone 1 


Table 2 describes the complexity of the full interaction networks 
for Puzzles 14-23 for the full sample of students playing the game. 
The full IN of every state and every action taken was large, 
complex, and difficult to interpret in terms of player 
understanding. 


Table 2: Interaction Networks in Puzzles 14-23 


Game Level | # Players | # Total Moves | # Edges | # States | # Laser Shapes
14         | 479       | 3003          | 462     | 164      | 5
15         | 473       | 3866          | 1009    | 484      | 10
16         | 462       | 3218          | 761     | 446      | 12
17         | 454       | 10878         | 1899    | 1067     | 21
18         | 439       | 10314         | 3458    | 1800     | 22
19         | 416       | 15389         | 7093    | 4550     | 330
20         | 384       | 10778         | 4947    | 2391     | 264
21         | 349       | 23080         | 13919   | 6261     | 696
22         | 282       | 3697          | 1500    | 1017     | 146
23         | 271       | 10529         | 6154    | 4138     | 364


4.2 Cluster States by Laser Shapes 


Most puzzles have states in which different configurations of 
objects result in similar output. These states could be considered
equivalent since they show the same player proficiencies or errors,
but a simple state representation would consider them as different 
states. In previous work using INs for games, it has been helpful 
to consider the output of a state as well as the position/orientation 
of objects in that state [7]. To group these equivalent states, we 
took a similar approach, using “laser shape” as part of our state 
representation to create Approach Maps. Approach Maps are a 
visual summary of the information contained in the interaction 
network [7]. This reduction is created by grouping similar states 
together based on how often students co-visit the states during 
their solution attempts. Here, the approach map consists of a list 
of targets hit by a laser of the appropriate color and a list of angles 
taken by that laser. This allows game states that represent similar 
errors to be effectively grouped together, as shown in Figure 3. 


[Figure 3 panel labels, one pair of mirrors per panel:
Flat_Mirror(10,7,0) and Flat_Mirror(8,9,180);
Flat_Mirror(11,8,0) and Flat_Mirror(9,10,180);
Flat_Mirror(12,9,0) and Flat_Mirror(10,11,180)]


Figure 3: Using laser shape to group similar game states in 
Puzzle 18. 


This approach preserves the relevant properties of a board state 
while ignoring distance traveled, which is not relevant to the game 
state. 
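One way to picture this grouping is sketched below. It assumes a laser shape can be summarized by the set of targets hit and the sequence of beam angles, with segment lengths dropped; the state identifiers and values are hypothetical, and the paper does not specify the exact encoding used.

```python
from collections import defaultdict


def laser_shape(targets_hit, beam_angles):
    """Hashable summary of a board state's output.

    Keeps which targets are hit and the sequence of beam angles, and
    deliberately drops how far each beam segment travels.
    """
    return (frozenset(targets_hit), tuple(beam_angles))


def group_by_laser_shape(states):
    """Map each laser shape to the list of state ids that produce it."""
    groups = defaultdict(list)
    for state_id, (targets_hit, beam_angles) in states.items():
        groups[laser_shape(targets_hit, beam_angles)].append(state_id)
    return dict(groups)


# Hypothetical states: A and B differ only in where the mirrors sit,
# so their beams bend through the same angles and hit no targets.
states = {
    "A": ([], [0, 90]),
    "B": ([], [0, 90]),
    "C": (["red"], [0, 270]),
}
groups = group_by_laser_shape(states)
print(len(groups))   # 2
```

Because the key ignores segment length, shifting both mirrors along the beam path leaves a state in the same group.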


4.3 Classify Player Actions for Implicit Science Understanding

A Quantum Spectre game designer who has a science education
background worked with a researcher to classify each laser shape
into one of three categories: 


1) Correct move—placement and rotation of the mirror are 
consistent with an eventual goal state 


2) Placement errors—placement of the mirror in a location that 
does not match a goal state—may indicate a lack of 
understanding of slope. 


3) Rotation errors—rotation of a mirror to an angle that does 
not match a goal state—may indicate a lack of understanding 
of the Law of Reflection. 


As described elsewhere [1] using a subset of these data, the game 
designer and researcher also identified placements that were not 
consistent with a goal state but were more indicative of a lack of 
grasp of the puzzle mechanic than of a lack of science 
understanding. We labeled these Puzzle errors. For example, in
the puzzle shown in Figure 2, a correct solution requires players to
use the two available mirrors to direct the laser through the two
targets simultaneously. In Figure 4, player actions are consistent
with someone who understands slope (i.e., they placed the mirror
on the path of the laser) and the Law of Reflection (i.e., they
rotated the mirror to reflect the laser through the target).
However, their actions are not going to let them solve this puzzle.


Figure 4: Sample Puzzle Errors in Puzzle 18.


4.4 Automated Coding of Individual Player Behaviors

Once all laser shapes had been coded and puzzle error placements 
identified, we automated the coding of individual player 
behaviors. Every player behavior was classified as a Placement
Error, Rotation Error, or Puzzle Error (0=Not Present; 1=Present).
These are mutually exclusive player behaviors. Player actions
with none of these errors were classified as Correct. Figure 5
shows the distribution of player behaviors across each puzzle.


Figure 5: Error rates by puzzle level


The percentage of correct moves ranged from 39% in Level 18 to
79% in Level 20. Placement error rates ranged from 8% (Level 16)
to 35% (Level 18). Rotation errors were most common in
earlier puzzles, with rates ranging from 35% in Level 14 to 1% in
Level 21. In two puzzles, Levels 14 and 22, no puzzle errors were
possible. Puzzle error rates in the remaining puzzles ranged from
4% (Level 20) to 36% (Level 23).
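The coding step in this section can be sketched as a lookup that returns mutually exclusive 0/1 indicators. The lookup tables, labels, and the choice to check puzzle-error placements before the laser-shape code are assumptions of this sketch, not details given in the paper.

```python
ERROR_TYPES = ("Placement Error", "Rotation Error", "Puzzle Error")


def code_action(shape, placement, shape_codes, puzzle_error_placements):
    """Return 0/1 indicators for one player action.

    shape_codes maps a laser shape to its expert-assigned category
    ("Correct", "Placement Error", or "Rotation Error").
    puzzle_error_placements is the set of placements flagged as puzzle
    errors; checking it first is an assumption of this sketch.
    """
    if placement in puzzle_error_placements:
        label = "Puzzle Error"
    else:
        label = shape_codes[shape]
    flags = {e: int(e == label) for e in ERROR_TYPES}
    flags["Correct"] = int(label == "Correct")
    return flags


# Hypothetical lookup tables for one puzzle.
shape_codes = {"shape_1": "Correct", "shape_2": "Rotation Error"}
puzzle_error_placements = {(6, 3)}

print(code_action("shape_2", (1, 1), shape_codes, puzzle_error_placements))
```

Exactly one indicator is set for each action, which matches the mutually exclusive coding described above.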


5. RESEARCH QUESTIONS & HYPOTHESES


In this paper, we examine the ways in which the extent of players’ 
puzzle and science errors are related to changes in their 
performance on a pre-post assessment of slope and the Law of 
Reflection. We anticipated a negative relationship between
placement and rotation errors and pre-post assessment
results; that is, players whose gameplay demonstrates a lack of
understanding of the science concepts will have
smaller gains than players whose gameplay is consistent with an
implicit understanding of slope and the Law of Reflection. Our
anticipated relationship between puzzle errors and pre-post
assessment results was less clear. It could be that puzzle errors
interfere with players’ implicit learning of the science content. It
could also be that players who understand the science content are
just as likely to make puzzle errors as players without that
understanding, so there may be no relationship between the
number of puzzle errors and pre-post assessment results.


6. METHODS 


Teachers were assigned to one of three groups as part of a national 
Quantum Spectre implementation study. In Bridge classrooms, 
teachers encouraged students to play the game outside of class and 
used examples from the game as part of their science instruction. 
In Game Only classrooms, teachers encouraged students to play
the game but provided no game examples during their science
instruction. In Control classrooms, teachers and students did their
normal science instruction, with students unaware of
the game. This paper reports gameplay data from the 329 students
in 29 classes (14 Bridge and 15 Game Only) who participated in
the implementation study during the 2013-14 and 2014-15
academic years.


6.1 Sample 

This study focuses on Puzzles 14-23 in Zone 1 of the game, so
79 students who did not attempt Puzzle 14 were excluded from
these analyses. The final sample of
329 high school science students included 132 females, 162 
students in Bridge classrooms, 281 students in non-Honors/AP 
classrooms, and 249 students in classrooms where more than 75 
percent of the students participated in the study. 


6.2 Measures 
This study collected gameplay log data, as described above, as
well as pre-post assessments and student/classroom characteristics.


6.2.1 Gameplay Metrics 

Because (a) students used varying numbers of moves to solve the
puzzles and (b) not all students completed Levels 14-23, we
calculated the percentage of each student’s total number of moves
(actions) that were correct, placement errors, rotation errors, and
puzzle errors. The mean error rate across all
students was 19% placement errors, 7% rotation errors, and 12%
puzzle errors. We used standardized error rates (z-scores) in the
analyses.
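The two calculations in this paragraph, a per-student error percentage and its standardization, might look like the sketch below. The move counts are made up for illustration, and the use of the population standard deviation is an assumption since the paper does not say which was used.

```python
from statistics import mean, pstdev


def error_rate(n_errors, n_moves):
    """Percentage of a student's moves that fall in one error category."""
    return 100.0 * n_errors / n_moves


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    m = mean(values)
    sd = pstdev(values)  # population SD; an assumption of this sketch
    return [(v - m) / sd for v in values]


# Three hypothetical students with 2, 4, and 6 rotation errors in 20 moves.
rates = [error_rate(2, 20), error_rate(4, 20), error_rate(6, 20)]
z = z_scores(rates)
print(rates)   # [10.0, 20.0, 30.0]
```

Using percentages rather than raw counts keeps students who played more moves from looking more error-prone simply because they played longer.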


The total amount of time each student played Quantum Spectre 
and the highest level reached were also recorded. Because previous
analyses showed Puzzle 21 to have a high dropout rate [21], we
analyzed whether completing Puzzle 21 had any
relationship to changes in pre-post assessment results. Among this
sample, there was no significant difference in the percentage of
students in Bridge and Game Only classrooms that reached Puzzle
22 (χ²=3.53, 1 d.f., p=0.06). Given the non-normal distribution of
the amount of time students played Quantum Spectre, we
categorized students as having played less than 1 hour, or 1 hour
or more. Forty-one percent of students played 1 hour or more; this
proportion did not differ significantly between students in Bridge
and Game Only classrooms (χ²=3.23, 1 d.f., p=0.07).


6.2.2 Students & Classroom Characteristics 

When completing the pre-assessment, students were asked to 
indicate their gender. We categorized class names (e.g., Honors 
Physics 101) obtained from teacher applications as being either 
Honors/AP classes or not. Seven of the 29 classes in this study 
were Honors/AP classes. Finally, we asked teachers the total 
number of students enrolled in each class. We calculated the 
percentage of the class with complete study information (e.g., 
complete consent/assent forms, pre-post assessments complete, 
and gameplay beyond Puzzle 1 in Zone 1). This ranged from 31 to 
100 percent of each class, with the majority of classes (26) having 
more than half of the students participating. 


6.2.3 Assessments 

Science content experts developed assessment instruments and 
tested them in a series of think-aloud interviews with 10 high 
school students. The assessments contained 12 (pre) and 13 (post)
questions that required minimal formalisms to complete. The pre-
and post-assessments each included 3 items related to focal length 
that are not included in these analyses. Figures 6 and 7 are sample 
items for slope and the Law of Reflection, respectively. In Figure 
6, students are asked which point (A-D) a line drawn through the 
two black points would hit. The item in Figure 7 asks students 
which letter each laser would hit.


Figure 6: Sample Slope assessment item


Figure 7: Sample Law of Reflection assessment item 


These analyses are limited to the 9 pre and 10 post items focused 
on slope and the Law of Reflection. These pre- and post-
assessment items had good internal consistency (Cronbach’s alpha
was 0.70 for the pre-assessment and 0.73 for the post-assessment).
To account for the different
number of items, we used the percentage of items answered
correctly in the analyses. Students answered an average of 53
percent of the pre-assessment items and 59 percent of the
post-assessment items correctly. Students in Bridge classrooms,
however, answered significantly fewer questions correctly on both
the pre- and post-assessment than students in Game Only
classrooms (F(1, 132)=19.2, p<0.01). On average, students in
Bridge classrooms answered 48 percent of the pre-assessment and
55 percent of the post-assessment items correctly. In contrast,
students in Game Only classrooms answered 58 percent of the
pre-assessment items and 63 percent of the post-assessment items
correctly.
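For readers unfamiliar with the internal-consistency statistic reported above, the following sketch computes Cronbach's alpha from a small matrix of item scores. The scores are invented for illustration and are not study data.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-student item-score lists.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals)),
    where k is the number of items.
    """
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Four hypothetical students answering three items (1 = correct).
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
print(cronbach_alpha(scores))   # 0.75
```

Alpha rises toward 1 as items vary together across students, which is why it is read as a measure of how consistently the items tap the same construct.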


7. RESULTS 

Using the SPSS MIXED linear models procedure, HLM analyses 
began with an unconditional 3-level model with students, 
classrooms, and teachers using Restricted Maximum Likelihood 
(REML) and unstructured covariances. In the 3-level model, 
seven percent of the variation was at the teacher level. Triple that 
proportion of the overall variation was attributable to the 
classroom level. We then estimated a 2-level unconditional model
with students nested within classrooms. In that model, a
statistically significant 34 percent of the variance in the post- 
assessment was attributable to classroom level variation. 
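The share of variance attributed to the classroom level in an unconditional 2-level model is conventionally the intraclass correlation: between-classroom variance divided by total variance. The sketch below uses hypothetical variance components chosen only to reproduce a 34 percent share; the paper does not report the components themselves.

```python
def intraclass_correlation(between_var, within_var):
    """Proportion of outcome variance that lies between groups."""
    return between_var / (between_var + within_var)


# Hypothetical variance components (not reported in the paper).
icc = intraclass_correlation(between_var=0.34, within_var=0.66)
print(round(icc, 2))   # 0.34
```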


Sets of covariates were added to the unconditional HLM model in 
this order: 


Set 1. Pre-assessment score (standardized) 
Set 2. Study Group (Bridge or Game Only) 
Set 3. Student gender (1=Female) 


Set 4. Classroom Level Characteristics: Whether or not they were
enrolled in a class in which more than half of the students
completed the study (1=Yes); whether or not they were enrolled in
an AP/Honors science class (1=Yes)


Set 5. In-game measures of implicit understanding—% Placement 
Errors, % Rotation Errors, and % Puzzle Errors (all standardized) 


Set 6. Gameplay duration (1 hour or more vs. less) and highest level
reached (Level 22 vs. not)


Only statistically significant covariates were retained in the HLM 
model presented in this paper. Sets 3, 4, and 6 had no significant 
results, meaning student gender, Honors/AP status, gameplay 
duration and highest level reached were not significantly related 
to changes in pre-post assessment scores. 


The model with the in-game measures of implicit understanding
of slope and the Law of Reflection was a better fit, at the 0.10
level, than the model without those measures (χ²(3, N=317)=6.76,
p<0.10). The best-fitting HLM model, which accounts for 33
percent of the variation at the classroom level, is presented in
Table 3. Overall, after accounting for students’ performance on
the pre-assessment, students who exhibited more Placement and
Rotation errors while playing the game performed more poorly on
the post-assessment than students with lower science error rates.
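The model comparison above reports a chi-square statistic of 6.76 on 3 degrees of freedom. As a check on how that maps to a p-value, the sketch below uses the closed-form upper-tail probability for a chi-square distribution with 3 degrees of freedom; the function name is hypothetical and the formula applies to 3 degrees of freedom only.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 d.f.

    Closed form for df = 3:
    erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2).
    """
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


p = chi2_sf_3df(6.76)
print(0.05 < p < 0.10)   # True
```

The result falls between 0.05 and 0.10, consistent with the reported p<0.10.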


Table 3: Best-fitting HLM model 


Parameter              | Est.  | Std. Err. | df  | Sig. | 95% CI Lower | 95% CI Upper
Intercept              | 0.10  | 0.12      | 24  | 0.43 | -0.15        | 0.35
Pre-Assessment¹        | 0.35  | 0.05      | 320 | 0.00 | 0.26         | 0.45
Bridge (vs. Game Only) | -0.17 | 0.17      | 25  | 0.33 | -0.52        | 0.18
%Placement Errors¹     | -0.08 | 0.05      | 304 | 0.09 | -0.17        | 0.01
%Rotation Errors¹      | -0.17 | 0.05      | 320 | 0.00 | -0.26        | -0.07
%Puzzle Errors¹        | 0.00  | 0.04      | 310 | 0.93 | -0.09        | 0.08
¹Standardized


The intercept coefficient represents the estimated outcome for
male students who scored at the mean level of the pre-assessment,
were in the Game Only group, were not in an Honors/AP class, and
had mean levels of Placement and Rotation Errors. These students
would score 0.07 standard deviations below the mean post-
assessment score. The Pre-Assessment coefficient reflects the
change in number of standard deviations of the post-assessment
for every increase of 1 standard deviation on the pre-assessment.
For every standard deviation increase on the pre-assessment,
students would be expected to score 0.35 standard deviations
higher on the post-assessment. Students in Bridge classes scored
0.17 standard deviations lower on the post-assessment than
students in Game Only classes, a non-significant difference.
There was no significant difference between Bridge and Game
Only groups in their pre-post gains. This may be because Game
Only classroom instruction provided lab experiences with lasers
that mirrored what Bridge classrooms did with Quantum Spectre,
providing comparable experiences and similar gains.


Students whose placement or rotation error rate was one standard
deviation above the mean, however, had post-assessment scores
0.08 and 0.17 standard deviations lower, respectively, than
otherwise comparable students with mean error rates. There was no
significant relationship between puzzle errors and post-assessment
scores. Interactions between study
group (Bridge vs. Game Only) and gameplay errors were
examined but none significantly improved the fit of the HLM
model, suggesting the relationship between these errors and
post-assessment scores was the same across study groups.
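To illustrate how the coefficients in Table 3 combine, the sketch below computes a fixed-effects prediction of the standardized post-assessment score. It copies the estimates from Table 3, ignores the classroom-level random intercept, and uses hypothetical argument names; it is an illustration of reading the table, not a re-estimation of the model.

```python
# Fixed-effect estimates copied from Table 3 (standard-deviation units).
COEF = {
    "intercept": 0.10,
    "pre": 0.35,
    "bridge": -0.17,
    "placement": -0.08,
    "rotation": -0.17,
    "puzzle": 0.00,
}


def predicted_post(pre=0.0, bridge=0, placement=0.0, rotation=0.0, puzzle=0.0):
    """Fixed-effects prediction of the standardized post-assessment score.

    pre, placement, rotation, and puzzle are z-scores; bridge is 1 for
    Bridge classrooms and 0 for Game Only classrooms.
    """
    return (COEF["intercept"]
            + COEF["pre"] * pre
            + COEF["bridge"] * bridge
            + COEF["placement"] * placement
            + COEF["rotation"] * rotation
            + COEF["puzzle"] * puzzle)


# Effect of a rotation error rate one standard deviation above the mean,
# holding everything else at its reference value.
diff = predicted_post(rotation=1.0) - predicted_post()
print(round(diff, 2))   # -0.17
```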


8. DISCUSSION & IMPLICATIONS 


Hierarchical linear modeling suggests a direct negative relationship
between science-related gameplay errors and implicit science
learning: players making errors consistent with a lack of implicit
science understanding performed worse than players not making
as many of those errors. Educators can use this information as a
real-time, or reflective, formative assessment tool. This could be 
very useful in a class where students are playing a learning game, 
individually or in groups, while the teacher has an app that alerts 
them to which students are struggling and may need attention. A 
more comprehensive dashboard they can use after class might 
show them overall progress of their class and trends that inform 
how the next lessons are planned. Teachers might also use a 
dashboard to monitor their students’ game-based learning as they 
play at home or with friends outside of class. The ability to validly 
infer implicit science learning from the digital records of game 
activity makes this all possible.